Krzysztof Tomala - Explainable AI - Homework 2

Report

List of features with short descriptions

Age : Age of the patient

Sex : Sex of the patient

exang: exercise induced angina (1 = yes; 0 = no)

ca: number of major vessels visible in fluoroscopy (0-3)

cp : chest pain type (1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic)

trtbps : resting blood pressure (in mm Hg)

chol : cholesterol in mg/dl fetched via BMI sensor

fbs : (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)

rest_ecg : resting electrocardiographic results (0 = normal; 1 = ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV); 2 = probable or definite left ventricular hypertrophy by Estes' criteria)

thalach : maximum heart rate achieved

thal : thallium stress test result

slope : the slope of the peak exercise ST segment (2 = upsloping; 1 = flat; 0 = downsloping)

oldpeak : ST depression induced by exercise relative to rest

Tasks

Find any two observations in the dataset, such that they have different variables of the highest importance, e.g. age and gender have the highest (absolute) attribution for observation A, but race and class are more important for observation B.

Waterfall plots of the SHAP values of observations:

image

image

The observations have different variables of the highest importance: for the first observation it is oldpeak, which is fairly high, and for the second it is ca, i.e. no major vessels being visible on the fluoroscopy.
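Such a pair of observations can also be found programmatically instead of by inspecting waterfall plots one by one: for each observation, take the feature with the largest absolute attribution and look for rows where it differs. A minimal sketch with an illustrative (not computed) attribution matrix:

```python
import numpy as np

# Hypothetical |SHAP| attribution matrix: rows = observations, columns = features.
# In the report this would be np.abs(shap_values.values) from the fitted forest;
# the numbers below are illustrative only.
feature_names = ["oldpeak", "ca", "thalach", "chol"]
shap_abs = np.array([
    [0.30, 0.05, 0.10, 0.02],   # observation A: oldpeak dominates
    [0.04, 0.25, 0.08, 0.03],   # observation B: ca dominates
])

# The most important feature for each observation is the column with the
# largest absolute attribution.
top_feature = [feature_names[i] for i in shap_abs.argmax(axis=1)]
print(top_feature)  # a pair with different top features satisfies the task
```

Scanning all pairs of rows for differing `top_feature` values yields a valid pair whenever the dataset is not dominated by a single feature.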

(If possible) Select one variable X and find two observations in the dataset such that for one observation, X has a positive attribution, and for the other observation, X has a negative attribution.

For the same observations as before we can see that in the first one oldpeak is fairly high and has a negative attribution to the prediction, while in the second observation oldpeak is low and its attribution is positive.
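A sign-flipping pair like this can be located directly from the attribution vector of the chosen feature. A minimal sketch, assuming illustrative attribution values rather than the ones computed in the appendix:

```python
import numpy as np

# Hypothetical signed SHAP attributions of one chosen feature (say, oldpeak)
# across several test observations; in the report these would come from the
# fitted explainer. The values below are made up for illustration.
oldpeak_attr = np.array([-0.21, 0.03, 0.15, -0.07, 0.11])

# Pick the first observation with a positive attribution and the first with
# a negative one.
pos_idx = int(np.where(oldpeak_attr > 0)[0][0])
neg_idx = int(np.where(oldpeak_attr < 0)[0][0])
print(pos_idx, neg_idx)  # indices of a positive/negative attribution pair
```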

(How) Do the results differ across the two packages selected in point (3)?

The results from dalex are presented on the following plots:

image

We can see that the results are very similar. Although some randomness is involved, in both cases the most important variable is the same and the feature attributions have the same signs in both packages.
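The sign agreement between the two packages can be quantified rather than judged by eye. A minimal sketch, with illustrative attribution vectors standing in for the dalex and shap outputs:

```python
import numpy as np

# Hypothetical attributions for the same observation from two explainers
# (e.g. dalex predict_parts vs. shap TreeExplainer); values are illustrative.
attr_dalex = np.array([0.12, -0.08, 0.05, -0.01])
attr_shap = np.array([0.10, -0.06, 0.07, -0.02])

# Fraction of features on which the two packages agree in attribution sign.
agreement = float(np.mean(np.sign(attr_dalex) == np.sign(attr_shap)))
print(agreement)  # 1.0 would mean full sign agreement
```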

(Using one explanation package of choice) Train another model of any class: neural network, linear model, decision tree etc. and find an observation for which SHAP attributions are different between this model and the one trained in point (1).

image

image

The first plot was created for the random forest classifier and the second one for the logistic regression. We can clearly see that logistic regression relies much more on the ca parameter than the forest does.
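An observation where the two models disagree can be found by ranking features by the gap between their attributions under each model. A minimal sketch with illustrative numbers (not the ones from the appendix):

```python
import numpy as np

# Hypothetical SHAP attributions of one observation under two models
# (random forest vs. logistic regression); names and values are illustrative.
feature_names = np.array(["oldpeak", "ca", "thalach", "chol"])
attr_forest = np.array([0.20, 0.05, 0.08, 0.01])
attr_logreg = np.array([0.06, 0.22, 0.07, 0.02])

# Rank features by how much the two models disagree in attribution.
gap = np.abs(attr_forest - attr_logreg)
order = np.argsort(gap)[::-1]
print(feature_names[order][0])  # feature with the largest disagreement
```

Repeating this over all observations and picking the row with the largest total gap gives a principled choice of the observation to plot.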

Comment on the results obtained in points (4)-(7)

I think that the results are pretty much what was to be expected.
Regarding task 4: if we take two observations, one of which has an extreme value of some feature while the other has a value close to the average but extreme values of other features, then it is quite logical that the model focuses on the features with extreme values, because they carry a lot of information about the observation.
Regarding task 5: if we take two observations with very different values of some feature, then we should expect one attribution to be positive and the other negative.
As for the numerical results, I think they are very similar across packages, though I find the shap plots more visually appealing.
It also makes sense that different models attend to features differently. Some models focus on only a subset of features (like linear models with an L1 penalty), while other models always look at all of them.
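The claim about L1-penalized linear models selecting a feature subset can be illustrated with a tiny experiment. A minimal sketch using Lasso (L1-penalized linear regression, not the report's classifier) on synthetic data where only the first feature matters:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first feature actually drives the target; the rest are noise.
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

# A strong L1 penalty shrinks the irrelevant coefficients exactly to zero.
model = Lasso(alpha=0.5).fit(X, y)
print(model.coef_)  # only the first coefficient stays nonzero
```

Any attribution method applied to such a model would assign (near) zero importance to the dropped features, whereas a random forest would typically still spread some attribution across them.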

Appendix

In [1]:
import pandas as pd
import sklearn
from sklearn import ensemble
import dalex as dx
import shap
In [2]:
dataset = pd.read_csv('heart.csv')
dataset = pd.get_dummies(dataset)
dataset
Out[2]:
age sex cp trtbps chol fbs restecg thalachh exng oldpeak slp caa thall output
0 63 1 3 145 233 1 0 150 0 2.3 0 0 1 1
1 37 1 2 130 250 0 1 187 0 3.5 0 0 2 1
2 41 0 1 130 204 0 0 172 0 1.4 2 0 2 1
3 56 1 1 120 236 0 1 178 0 0.8 2 0 2 1
4 57 0 0 120 354 0 1 163 1 0.6 2 0 2 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
298 57 0 0 140 241 0 1 123 1 0.2 1 0 3 0
299 45 1 3 110 264 0 1 132 0 1.2 1 0 3 0
300 68 1 0 144 193 1 1 141 0 3.4 1 2 3 0
301 57 1 0 130 131 0 1 115 1 1.2 1 1 3 0
302 57 0 1 130 236 0 0 174 0 0.0 1 1 2 0

303 rows × 14 columns

In [3]:
features = dataset.drop(columns='output')

# fix column-name typos in the data
features['thalach']=features['thalachh']
features = features.drop(columns='thalachh')

features['slope']=features['slp']
features = features.drop(columns='slp')

features['ca']=features['caa']
features = features.drop(columns='caa')

features = pd.get_dummies(features, columns=['cp', 'thall'])

features
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(features, dataset['output'], test_size=0.3, random_state=0)
X_train
Out[3]:
age sex trtbps chol fbs restecg exng oldpeak thalach slope ca cp_0 cp_1 cp_2 cp_3 thall_0 thall_1 thall_2 thall_3
137 62 1 128 208 1 0 0 0.0 140 2 0 0 1 0 0 0 0 1 0
106 69 1 160 234 1 0 0 0.1 131 1 1 0 0 0 1 0 0 1 0
284 61 1 140 207 0 0 1 1.9 138 2 1 1 0 0 0 0 0 0 1
44 39 1 140 321 0 0 0 0.0 182 2 0 0 0 1 0 0 0 1 0
139 64 1 128 263 0 1 1 0.2 105 1 1 1 0 0 0 0 0 0 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
251 43 1 132 247 1 0 1 0.1 143 1 4 1 0 0 0 0 0 0 1
192 54 1 120 188 0 1 0 1.4 113 1 1 1 0 0 0 0 0 0 1
117 56 1 120 193 0 0 0 1.9 162 1 0 0 0 0 1 0 0 0 1
47 47 1 138 257 0 0 0 0.0 156 2 0 0 0 1 0 0 0 1 0
172 58 1 120 284 0 0 0 1.8 160 1 0 0 1 0 0 0 0 1 0

212 rows × 19 columns

In [4]:
forest = sklearn.ensemble.RandomForestClassifier()
forest.fit(X=X_train,y=y_train)
print(f'Accuracy: {sklearn.metrics.accuracy_score(y_test,forest.predict(X_test))}')
print(f'Recall: {sklearn.metrics.recall_score(y_test,forest.predict(X_test))}')
print(f'Precision: {sklearn.metrics.precision_score(y_test,forest.predict(X_test))}')

forest_accuracy = sklearn.metrics.accuracy_score(y_test,forest.predict(X_test))
forest_recall = sklearn.metrics.recall_score(y_test,forest.predict(X_test))
forest_precision = sklearn.metrics.precision_score(y_test,forest.predict(X_test))

print('\nResults on train dataset:')
print(f'Accuracy: {sklearn.metrics.accuracy_score(y_train,forest.predict(X_train))}')
print(f'Recall: {sklearn.metrics.recall_score(y_train,forest.predict(X_train))}')
print(f'Precision: {sklearn.metrics.precision_score(y_train,forest.predict(X_train))}')
Accuracy: 0.8241758241758241
Recall: 0.8936170212765957
Precision: 0.7924528301886793

Results on train dataset:
Accuracy: 1.0
Recall: 1.0
Precision: 1.0
In [5]:
len(X_test)
Out[5]:
91
In [6]:
obs = [0, 1]
In [7]:
obs1 = X_test.iloc[obs[0]].to_numpy().reshape(1,-1)
print(obs1)
forest.predict(obs1)
[[ 70.    1.  145.  174.    0.    1.    1.    2.6 125.    0.    0.    1.
    0.    0.    0.    0.    0.    0.    1. ]]
X does not have valid feature names, but RandomForestClassifier was fitted with feature names
Out[7]:
array([0])
In [8]:
obs2 = X_test.iloc[obs[1]].to_numpy().reshape(1,-1)
print(obs2)
forest.predict(obs2)
[[ 64.    1.  170.  227.    0.    0.    0.    0.6 155.    1.    0.    0.
    0.    0.    1.    0.    0.    0.    1. ]]
X does not have valid feature names, but RandomForestClassifier was fitted with feature names
Out[8]:
array([1])
In [9]:
model = forest
X = X_test
y = y_test
predict = lambda m, d: m.predict(d)
explainer = dx.Explainer(forest, X_test, y_test, predict_function=predict, label="GBM")
Preparation of a new explainer is initiated

  -> data              : 91 rows 19 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 91 values
  -> model_class       : sklearn.ensemble._forest.RandomForestClassifier (default)
  -> label             : GBM
  -> predict function  : <function <lambda> at 0x7f384028daf0> will be used
  -> predict function  : Accepts pandas.DataFrame and numpy.ndarray.
  -> predicted values  : min = 0.0, mean = 0.582, max = 1.0
  -> model type        : classification will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -1.0, mean = -0.0659, max = 1.0
  -> model_info        : package sklearn

A new explainer has been created!
X does not have valid feature names, but RandomForestClassifier was fitted with feature names
In [10]:
explainer.model_performance()
Out[10]:
recall precision f1 accuracy auc
GBM 0.893617 0.792453 0.84 0.824176 0.821809
In [11]:
shap_attributions = [explainer.predict_parts(X.iloc[[i]], type="shap", label=f'Person {i}') for i in obs]
In [12]:
shap_attributions
Out[12]:
[<dalex.predict_explanations._shap.object.Shap at 0x7f38402bd760>,
 <dalex.predict_explanations._shap.object.Shap at 0x7f38403ba1f0>]
In [13]:
shap_attributions[0].plot(shap_attributions[1::])
In [14]:
bd_attributions = [explainer.predict_parts(X.iloc[[i]], type="break_down", label=f'Person {i}') for i in obs]
In [15]:
bd_attributions[0].plot(bd_attributions[1::])
In [16]:
X
Out[16]:
age sex trtbps chol fbs restecg exng oldpeak thalach slope ca cp_0 cp_1 cp_2 cp_3 thall_0 thall_1 thall_2 thall_3
225 70 1 145 174 0 1 1 2.6 125 0 0 1 0 0 0 0 0 0 1
152 64 1 170 227 0 0 0 0.6 155 1 0 0 0 0 1 0 0 0 1
228 59 1 170 288 0 0 0 0.2 159 1 0 0 0 0 1 0 0 0 1
201 60 1 125 258 0 0 1 2.8 141 1 1 1 0 0 0 0 0 0 1
52 62 1 130 231 0 1 0 1.8 146 1 3 0 0 1 0 0 0 0 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
253 67 1 100 299 0 0 1 0.9 125 1 2 1 0 0 0 0 0 1 0
293 67 1 152 212 0 0 0 0.8 150 1 0 0 0 1 0 0 0 0 1
76 51 1 125 245 1 0 0 2.4 166 1 0 0 0 1 0 0 0 1 0
272 67 1 120 237 0 1 0 1.0 71 1 0 1 0 0 0 0 0 1 0
238 77 1 125 304 0 0 1 0.0 162 2 3 1 0 0 0 0 0 1 0

91 rows × 19 columns

In [25]:
shap_explainer = shap.explainers.Tree(forest, data=X, model_output="probability")
shap_values = shap_explainer(X)[:,:,1]
shap_values
Out[25]:
.values =
array([[ 4.14653751e-02, -8.91374489e-03, -2.76074043e-02, ...,
        -8.75457858e-04, -1.02183846e-01, -4.41006886e-02],
       [-9.09222897e-03, -1.07833591e-02, -2.23831761e-02, ...,
         7.32600738e-05, -6.30189669e-02, -2.93406589e-02],
       [-1.84068546e-02, -1.23489007e-02, -1.86057038e-02, ...,
         1.73992676e-04, -5.90549439e-02, -3.08107004e-02],
       ...,
       [ 3.17397955e-02, -1.45729980e-02,  8.48639435e-03, ...,
        -2.39926729e-04,  9.14165340e-02,  3.15158288e-02],
       [-8.78414412e-03, -2.84133955e-02,  1.50030088e-02, ...,
         2.12454208e-04,  8.06461694e-02,  3.37247947e-02],
       [ 1.33303244e-02, -1.62073519e-02,  7.37218764e-03, ...,
         5.86080581e-04,  8.01006003e-02,  3.23715328e-02]])

.base_values =
array([0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769, 0.56230769, 0.56230769, 0.56230769, 0.56230769,
       0.56230769])

.data =
array([[ 70.,   1., 145., ...,   0.,   0.,   1.],
       [ 64.,   1., 170., ...,   0.,   0.,   1.],
       [ 59.,   1., 170., ...,   0.,   0.,   1.],
       ...,
       [ 51.,   1., 125., ...,   0.,   1.,   0.],
       [ 67.,   1., 120., ...,   0.,   1.,   0.],
       [ 77.,   1., 125., ...,   0.,   1.,   0.]])
In [26]:
for i in range(10):
    shap.plots.waterfall(shap_values[i])
shap_values_forest = shap_values[4]
In [19]:
shap.plots.beeswarm(shap_values, max_display=10, plot_size=(9, 6))
In [20]:
import matplotlib.pyplot as plt
# plots.bar() has no plot_size parameter
shap.plots.bar(shap_values, max_display=10, show=False) 
plt.gcf().set_size_inches(9, 6)
plt.show()
In [21]:
import sklearn.linear_model
model = sklearn.linear_model.LogisticRegression(max_iter=500)
model.fit(X=X_train,y=y_train)
print(f'Accuracy: {sklearn.metrics.accuracy_score(y_test,model.predict(X_test))}')
print(f'Recall: {sklearn.metrics.recall_score(y_test,model.predict(X_test))}')
print(f'Precision: {sklearn.metrics.precision_score(y_test,model.predict(X_test))}')

model_accuracy = sklearn.metrics.accuracy_score(y_test,model.predict(X_test))
model_recall = sklearn.metrics.recall_score(y_test,model.predict(X_test))
model_precision = sklearn.metrics.precision_score(y_test,model.predict(X_test))

print('\nResults on train dataset:')
print(f'Accuracy: {sklearn.metrics.accuracy_score(y_train,model.predict(X_train))}')
print(f'Recall: {sklearn.metrics.recall_score(y_train,model.predict(X_train))}')
print(f'Precision: {sklearn.metrics.precision_score(y_train,model.predict(X_train))}')
Accuracy: 0.8241758241758241
Recall: 0.8936170212765957
Precision: 0.7924528301886793

Results on train dataset:
Accuracy: 0.8867924528301887
Recall: 0.940677966101695
Precision: 0.8671875
In [22]:
import warnings
shap_explainer = shap.Explainer(lambda x: model.predict_proba(x)[:, 1], X)
shap_values = shap_explainer(X)
shap_values
Out[22]:
.values =
array([[ 0.04654411, -0.03322153, -0.01690747, ..., -0.00080763,
        -0.0773779 , -0.06527902],
       [ 0.04075304, -0.03533344, -0.06350631, ..., -0.00090586,
        -0.07213566, -0.06798278],
       [ 0.02068742, -0.03714204, -0.06503483, ..., -0.00094165,
        -0.0752954 , -0.06901546],
       ...,
       [-0.01284611, -0.03111286,  0.01274013, ..., -0.0008187 ,
         0.06770172,  0.05527043],
       [ 0.04999217, -0.04120158,  0.01998245, ..., -0.00096859,
         0.0594222 ,  0.04676461],
       [ 0.08098693, -0.04073264,  0.01036298, ..., -0.00073059,
         0.04617049,  0.04291781]])

.base_values =
array([0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035, 0.53981035, 0.53981035, 0.53981035, 0.53981035,
       0.53981035])

.data =
array([[ 70.,   1., 145., ...,   0.,   0.,   1.],
       [ 64.,   1., 170., ...,   0.,   0.,   1.],
       [ 59.,   1., 170., ...,   0.,   0.,   1.],
       ...,
       [ 51.,   1., 125., ...,   0.,   1.,   0.],
       [ 67.,   1., 120., ...,   0.,   1.,   0.],
       [ 77.,   1., 125., ...,   0.,   1.,   0.]])
In [23]:
for i in range(10):
    shap.plots.waterfall(shap_values[i])
shap_values_lr = shap_values[4]
In [24]:
shap.plots.waterfall(shap_values_forest)
shap.plots.waterfall(shap_values_lr)